feat: add support for custom judges via evaluation metric key #86

knfreemLD · 2026-01-21T15:01:01Z

Requirements

I have added test coverage for new or changed functionality
I have followed the repository's pull request submission guidelines
I have validated my changes against all supported platform versions

Related issues

https://launchdarkly.atlassian.net/browse/REL-11511
See tech spec at https://docs.google.com/document/d/1lzYwQqCcTzN_2zkxJZDfJtgUcEJ4jbpx0KSsJ2bRENw/edit?tab=t.0#heading=h.69bdm7karsxh

Describe the solution you've provided

Updates the SDK to read the AI Config's new evaluationMetricKey property, falling back to the original evaluationMetricKeys list when the new property is not set. Also adds tests that were missing from the previous implementation.
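
For illustration, the lookup order can be sketched roughly as follows. This is a minimal sketch, not the code in this PR: the helper name `extract_evaluation_metric_key` is hypothetical, and it assumes the variation is a plain dict carrying `evaluationMetricKey` and `evaluationMetricKeys` at the top level. It also skips empty or non-string entries in the legacy list, which is an assumption about how strict the fallback should be.

```python
from typing import Any, Dict, Optional


def extract_evaluation_metric_key(variation: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper (not the SDK's actual function).

    Returns the single metric key a judge should report under, or None.
    """
    # Prefer the new single-key property when it is a non-empty string.
    key = variation.get("evaluationMetricKey")
    if isinstance(key, str) and key.strip():
        return key

    # Otherwise fall back to the first usable entry of the deprecated list.
    legacy_keys = variation.get("evaluationMetricKeys")
    if isinstance(legacy_keys, list):
        for candidate in legacy_keys:
            if isinstance(candidate, str) and candidate.strip():
                return candidate

    # Neither property yields a key, so the caller has nothing to evaluate.
    return None
```

Returning `None` instead of raising leaves it to the caller to decide whether a missing key should skip the evaluation or be logged as a configuration error.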


Note

Implements single-key judge evaluation with backward compatibility and comprehensive tests.

Switches judge configs to use evaluationMetricKey (deprecated evaluationMetricKeys), updating AIJudgeConfig(Default) serialization
LDAIClient.__evaluate now returns the raw variation; judge_config extracts evaluationMetricKey with fallback to first in evaluationMetricKeys
Judge updated to validate and parse a single metric; EvaluationSchemaBuilder builds a single-key structured schema; minor cleanup of unused imports/comments
Adds extensive unit tests for judge behavior, schema building, and client extraction (including consistency of single variation, sampling, error paths)

Written by Cursor Bugbot for commit c6d086a. This will update automatically on new commits.
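
As a rough illustration of what a single-key structured schema and its matching validation could look like, here is a sketch under stated assumptions. The function names `build_single_key_schema` and `parse_single_metric`, the `evaluations` wrapper, the `score` and `reasoning` fields, and the 0 to 1 score range are all assumptions made for this example; they are not taken from the PR's `EvaluationSchemaBuilder` or `Judge` code.

```python
from typing import Any, Dict, Optional


def build_single_key_schema(metric_key: str) -> Dict[str, Any]:
    """Build a JSON-Schema-style dict that requires exactly one metric entry.

    The shape is an assumption for illustration only.
    """
    metric_schema = {
        "type": "object",
        "properties": {
            "score": {"type": "number", "minimum": 0, "maximum": 1},
            "reasoning": {"type": "string"},
        },
        "required": ["score", "reasoning"],
        "additionalProperties": False,
    }
    return {
        "type": "object",
        "properties": {
            "evaluations": {
                "type": "object",
                "properties": {metric_key: metric_schema},
                "required": [metric_key],
                "additionalProperties": False,
            }
        },
        "required": ["evaluations"],
        "additionalProperties": False,
    }


def parse_single_metric(
    response: Dict[str, Any], metric_key: str
) -> Optional[Dict[str, Any]]:
    """Validate a structured response and return the single metric, or None."""
    evaluations = response.get("evaluations")
    if not isinstance(evaluations, dict):
        return None
    entry = evaluations.get(metric_key)
    if not isinstance(entry, dict):
        return None

    score = entry.get("score")
    reasoning = entry.get("reasoning")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    if not 0 <= score <= 1:
        return None
    if not isinstance(reasoning, str):
        return None
    return {"score": float(score), "reasoning": reasoning}
```

Building the schema from one key means the model is asked for exactly the metric the judge will record, and the parser can reject a response that answers under a different key.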

packages/sdk/server-ai/src/ldai/client.py

jsonbailey · 2026-01-21T16:31:24Z

packages/sdk/server-ai/src/ldai/models.py

    Default Judge-specific AI Config with required evaluation metric key.
    """
    messages: Optional[List[LDMessage]] = None
+    # Deprecated: evaluation_metric_key is used instead


Since we are pre-1.0, as long as we can guarantee the API always returns the new single key, we should be able to just drop this and make a breaking change. The only thing that really makes this breaking is that people will need to update their defaults if they defined it. If you want to drop it now, update the PR title to "feat!: ".

I won't block if you want to leave this in for a little while, but it likely isn't necessary. The real question is how long we want to continue sending the old values in the API, as that is what will break older SDKs.

knfreemLD · 2026-01-21

For now we want to make sure this is non-breaking, but soon we're going to remove "legacy" support. To keep this change as minimal and safe as possible, I'd err on the side of caution and keep it in for the time being.
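
One way to keep the deprecated field in place while signalling that it will go away is sketched below. This is an assumption-laden illustration, not the PR's `AIJudgeConfigDefault`: the class name `JudgeDefaultsSketch`, its `to_dict` output, and the use of `DeprecationWarning` are all hypothetical choices made for the example.

```python
import warnings
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class JudgeDefaultsSketch:
    """Hypothetical stand-in for a judge default config (not the SDK class)."""

    evaluation_metric_key: Optional[str] = None
    # Deprecated: evaluation_metric_key is used instead
    evaluation_metric_keys: Optional[List[str]] = None

    def __post_init__(self) -> None:
        # Nudge callers who still define only the legacy list in their defaults.
        if self.evaluation_metric_keys and not self.evaluation_metric_key:
            warnings.warn(
                "evaluation_metric_keys is deprecated; "
                "set evaluation_metric_key instead",
                DeprecationWarning,
                stacklevel=2,
            )

    def to_dict(self) -> Dict[str, Any]:
        # Emit only the fields that were set, using the camelCase wire names.
        result: Dict[str, Any] = {}
        if self.evaluation_metric_key is not None:
            result["evaluationMetricKey"] = self.evaluation_metric_key
        if self.evaluation_metric_keys is not None:
            result["evaluationMetricKeys"] = list(self.evaluation_metric_keys)
        return result
```

With this shape, existing defaults that only set the list keep working, and dropping the legacy field later becomes the single breaking change discussed above.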

…port

Add support for custom judges via evaluation metric key

8d01693

knfreemLD changed the title ~~[REL-11511] Add support for custom judges via evaluation metric key~~ feat: add support for custom judges via evaluation metric key Jan 21, 2026

knfreemLD added 2 commits January 21, 2026 10:12

fixed linter issues

350f884

Linting

00a265e

knfreemLD requested a review from jsonbailey January 21, 2026 15:21

knfreemLD marked this pull request as ready for review January 21, 2026 15:43

knfreemLD requested a review from a team as a code owner January 21, 2026 15:43

knfreemLD requested review from andrewklatzke and mattrmc1 January 21, 2026 15:44

jsonbailey requested changes Jan 21, 2026

knfreemLD added 2 commits January 21, 2026 12:27

Addressed PR feedback; fixed race condition

d277b49

modified default behaviour

d67d0ab

knfreemLD requested a review from jsonbailey January 21, 2026 17:45

jsonbailey approved these changes Jan 21, 2026

mattrmc1 approved these changes Jan 21, 2026

knfreemLD mentioned this pull request Jan 22, 2026

feat: Added custom judge support for ai configs launchdarkly/js-core#1073

Open

3 tasks

Merge branch 'main' into kfreeman/REL-11511/evaluation-metric-key-sup…

c6d086a

…port
